11 research outputs found

    Prediction of DNase I Hypersensitive Sites by Using Pseudo Nucleotide Compositions

    DNase I hypersensitive sites (DHS) are associated with a wide variety of regulatory DNA elements. Knowledge about the locations of DHS is helpful for deciphering the function of noncoding genomic regions. With the accelerating accumulation of genome sequences in the postgenomic age, it is highly desirable to develop cost-effective computational methods to identify DHS. In the present work, a support vector machine based model was proposed to identify DHS by using the pseudo dinucleotide composition. In the jackknife test, the proposed model obtained an accuracy of 83%, which is competitive with that of the existing method. This result suggests that the proposed model may become a useful tool for DHS identification.
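
    As a rough illustration of the sequence encoding mentioned in this abstract, the sketch below computes the plain dinucleotide composition of a DNA sequence: the 16 normalized dinucleotide frequencies. It is a simplification and not the authors' implementation. The additional sequence-order ("pseudo") terms of the pseudo dinucleotide composition are omitted because the abstract does not specify them, and the function name is hypothetical.

```python
from itertools import product

# All 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Return the 16 normalized dinucleotide frequencies of a DNA sequence.

    Hypothetical helper: this covers only the plain composition part and
    leaves out the sequence-order ("pseudo") components.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip windows containing N or other symbols
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[k] / total for k in DINUCLEOTIDES]


features = dinucleotide_composition("ACGTACGT")
```

    A vector like `features` could then be passed to a classifier such as a support vector machine. The choice of kernel and parameters is not stated in the abstract.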

    Classifying Included and Excluded Exons in Exon Skipping Event Using Histone Modifications

    Alternative splicing (AS) not only ensures the diversity of gene expression products, but is also closely correlated with genetic diseases. Therefore, knowledge about the regulatory mechanisms of AS will provide useful clues for understanding its biological functions. In the current study, a random forest based method was developed to classify included and excluded exons in exon skipping events. In this method, the samples in the dataset were encoded by using histone modification features selected with the Maximum Relevance Maximum Distance (MRMD) feature selection technique. The proposed method obtained an accuracy of 72.91% in a 10-fold cross-validation test and outperformed existing methods. Meanwhile, we also systematically analyzed the distribution of histone modifications between included and excluded exons and discovered their preference in both kinds of exons, which might provide insights into research on the regulatory mechanisms of alternative splicing.

    Predicting Antimicrobial Peptides by Using Increment of Diversity with Quadratic Discriminant Analysis Method

    Predicting the Types of J-Proteins Using Clustered Amino Acids

    J-proteins are molecular chaperones that are present in a wide variety of organisms, from prokaryotes to eukaryotes. Based on their domain organizations, J-proteins can be classified into 4 types, that is, Type I, Type II, Type III, and Type IV. Different types of J-proteins play distinct roles in influencing cancer properties and cell death. Thus, reliably annotating the types of J-proteins is essential to better understand their molecular functions. In the present work, a support vector machine based method was developed to identify the types of J-proteins using the tripeptide composition of a reduced amino acid alphabet. In the jackknife cross-validation, a maximum overall accuracy of 94% was achieved on a stringent benchmark dataset. We also analyzed the amino acid compositions by using analysis of variance and found distinct distributions of amino acids in each family of the J-proteins. To enhance the practical value of the proposed model, an online web server was developed and can be freely accessed.
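
    The encoding named in this abstract, tripeptide composition over a reduced amino acid alphabet, can be sketched as follows. The six-cluster grouping used here is purely an assumption for illustration. The abstract does not state which clustering the authors used, and the names `CLUSTERS` and `reduced_tripeptide_composition` are hypothetical.

```python
from itertools import product

# Hypothetical grouping of the 20 amino acids into 6 clusters (an assumption,
# not the clustering used in the paper).
CLUSTERS = {
    "a": "AVLIMC",
    "b": "FWYH",
    "c": "STNQ",
    "d": "KR",
    "e": "DE",
    "f": "GP",
}
RESIDUE_TO_CLUSTER = {aa: label for label, members in CLUSTERS.items() for aa in members}

# All 6**3 = 216 tripeptides over the reduced alphabet, in a fixed order.
TRIPEPTIDES = ["".join(p) for p in product(sorted(CLUSTERS), repeat=3)]


def reduced_tripeptide_composition(protein):
    """Map a protein to normalized tripeptide frequencies over the reduced alphabet."""
    # Unknown residues become "x", so any window containing one is skipped below.
    reduced = "".join(RESIDUE_TO_CLUSTER.get(aa, "x") for aa in protein.upper())
    counts = dict.fromkeys(TRIPEPTIDES, 0)
    total = 0
    for i in range(len(reduced) - 2):
        tri = reduced[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    if total == 0:
        return {k: 0.0 for k in TRIPEPTIDES}
    return {k: counts[k] / total for k in TRIPEPTIDES}


composition = reduced_tripeptide_composition("MKDE")
```

    Reducing the alphabet shrinks the feature space from 8000 tripeptides to 216 under this grouping, which is the usual motivation for clustering residues before counting.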

    Identifying RNA N6-Methyladenosine Sites in Escherichia coli Genome

    N6-methyladenosine (m6A) plays important roles in a variety of biological and physiological processes. Accurate identification of m6A sites is especially helpful for understanding their biological functions. Since wet-lab techniques are still expensive and time-consuming, it is urgent to develop computational methods to identify m6A sites from primary RNA sequences. Although there are some computational methods for identifying m6A sites, no methods are available for detecting m6A sites in microbial genomes. In this study, we developed a computational method for identifying m6A sites in the Escherichia coli genome. The accuracies obtained by the proposed method are >90% in both the 10-fold cross-validation test and the independent dataset test, indicating that the proposed method holds high potential to become a useful tool for the identification of m6A sites in microbial genomes.

    Benchmark data for identifying N6-methyladenosine sites in the Saccharomyces cerevisiae genome

    This data article contains the benchmark dataset for training and testing iRNA-Methyl, a web-server predictor for identifying N6-methyladenosine sites in RNA (Chen et al., 2015 [15]). It can also be used to develop other predictors for identifying N6-methyladenosine sites in the Saccharomyces cerevisiae genome.

    iRNA-PseColl: Identifying the Occurrence Sites of Different RNA Modifications by Incorporating Collective Effects of Nucleotides into PseKNC

    There are many different types of RNA modifications, which are essential for numerous biological processes. Knowledge of where RNA modifications occur in a sequence is key to an in-depth understanding of their biological functions and mechanisms. Unfortunately, it is both time-consuming and laborious to determine these sites by experiments alone. Although some computational methods have been developed in this regard, each one can only deal with a single type of modification. To our knowledge, no method has thus far been developed that can identify the occurrence sites for several different types of RNA modifications with one seamless package or platform. To address this challenge, a novel platform called “iRNA-PseColl” has been developed. It was formed by incorporating both the individual and collective features of the sequence elements into the general pseudo K-tuple nucleotide composition (PseKNC) of RNA via the chemicophysical properties and density distribution of its constituent nucleotides. Rigorous cross-validations have indicated that the anticipated success rates achieved by the proposed platform are quite high. To maximize the convenience for most experimental biologists, the platform’s web-server has been provided at http://lin.uestc.edu.cn/server/iRNA-PseColl along with a step-by-step user guide that will allow users to easily achieve their desired results without the need to go through the mathematical details involved in this paper.

    Identification of D Modification Sites by Integrating Heterogeneous Features in Saccharomyces cerevisiae

    As an abundant post-transcriptional modification, dihydrouridine (D) has been found in transfer RNA (tRNA) from bacteria, eukaryotes, and archaea. Nonetheless, knowledge of the exact biochemical roles of dihydrouridine in mediating tRNA function is still limited. Accurate identification of the positions of D sites is essential for understanding their functions. Therefore, it is desirable to develop novel methods to identify D sites. In this study, an ensemble classifier was proposed for the detection of D modification sites in the Saccharomyces cerevisiae transcriptome by using heterogeneous features. The jackknife test results demonstrate that the proposed predictor is promising for the identification of D modification sites. It is anticipated that the proposed method can be widely used for identifying D modification sites in tRNA.

    Data_Sheet_1_Identifying RNA N6-Methyladenosine Sites in Escherichia coli Genome.XLSX

    N6-methyladenosine (m6A) plays important roles in a variety of biological and physiological processes. Accurate identification of m6A sites is especially helpful for understanding their biological functions. Since wet-lab techniques are still expensive and time-consuming, it is urgent to develop computational methods to identify m6A sites from primary RNA sequences. Although there are some computational methods for identifying m6A sites, no methods are available for detecting m6A sites in microbial genomes. In this study, we developed a computational method for identifying m6A sites in the Escherichia coli genome. The accuracies obtained by the proposed method are >90% in both the 10-fold cross-validation test and the independent dataset test, indicating that the proposed method holds high potential to become a useful tool for the identification of m6A sites in microbial genomes.